Metadata

%0 Conference Proceedings
%4 sid.inpe.br/sibgrapi/2018/09.02.02.43
%2 sid.inpe.br/sibgrapi/2018/09.02.02.43.26
%@doi 10.1109/SIBGRAPI.2018.00061
%T A Machine Learning approach for Graph-based Page Segmentation
%D 2018
%A Maia, Ana Lucia Lima Marreiros,
%A Julca-Aguilar, Frank Dennis,
%A Hirata, Nina Sumiko Tomita,
%@affiliation University of São Paulo/State University of Feira de Santana
%@affiliation University of São Paulo
%@affiliation University of São Paulo
%E Ross, Arun,
%E Gastal, Eduardo S. L.,
%E Jorge, Joaquim A.,
%E Queiroz, Ricardo L. de,
%E Minetto, Rodrigo,
%E Sarkar, Sudeep,
%E Papa, João Paulo,
%E Oliveira, Manuel M.,
%E Arbeláez, Pablo,
%E Mery, Domingo,
%E Oliveira, Maria Cristina Ferreira de,
%E Spina, Thiago Vallin,
%E Mendes, Caroline Mazetto,
%E Costa, Henrique Sérgio Gutierrez,
%E Mejail, Marta Estela,
%E Geus, Klaus de,
%E Scheer, Sergio,
%B Conference on Graphics, Patterns and Images, 31 (SIBGRAPI)
%C Foz do Iguaçu, PR, Brazil
%8 29 Oct.-1 Nov. 2018
%I IEEE Computer Society
%J Los Alamitos
%S Proceedings
%K Page segmentation, document image, machine learning, graph, connected components classification, convolutional neural network.
%X We propose a new approach for segmenting a document image into its page components (e.g., text, graphics and tables). Our approach consists of two main steps. In the first step, each connected component in the document is assigned a set of scores, one for each possible page component category, taken from the output of a convolutional neural network. The labeled connected components define a fuzzy over-segmentation of the page. In the second step, spatially close connected components that are likely to belong to the same page component are grouped together. This is done by building an attributed region adjacency graph of the connected components and modeling the grouping as an edge removal problem. Edges are kept or removed based on a pre-trained classifier. The resulting groups, defined by the connected subgraphs, correspond to the detected page components. We evaluate our method on the ICDAR2009 dataset. Results show that our method effectively segments pages and is able to detect the nine types of page components. Furthermore, as our approach is based on simple machine learning models and graph-based techniques, it should be easy to adapt to the segmentation of a variety of document types.
%@language en
%3 Final_PaperID_50.pdf